Bump org.jsoup:jsoup from 1.21.2 to 1.22.2 #1107

Merged
Brutus5000 merged 1 commit into develop from dependabot/gradle/org.jsoup-jsoup-1.22.1 on Jun 6, 2026

Conversation

dependabot bot (Contributor) commented on behalf of github on Jan 1, 2026

Bumps org.jsoup:jsoup from 1.21.2 to 1.22.2.

Release notes

Sourced from org.jsoup:jsoup's releases.

jsoup Java HTML Parser release 1.22.2

jsoup 1.22.2 is out now, with fixes and refinements across the library. It makes editing the DOM during traversal more predictable, refreshes the default HTML tag definitions with newer elements and better text boundaries, and improves reliability in parsing and HTTP transport. The release also fixes a number of edge cases in cleaning, stream parsing, XML doctype handling, and Android packaging.

jsoup is a Java library for working with real-world HTML and XML. It provides a very convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.

Download jsoup now.

Improvements

  • Expanded and clarified NodeTraversor support for in-place DOM rewrites during NodeVisitor.head(). Current-node edits such as remove, replace, and unwrap now recover more predictably, while traversal stays within the original root subtree. This makes single-pass tree cleanup and normalization visitors easier to write, for example when unwrapping presentational elements or replacing text nodes as you walk the DOM. #2472
  • Documentation: clarified that a configured Cleaner may be reused across concurrent threads, and that shared Safelist instances should not be mutated while in use. #2473
  • Updated the default HTML TagSet for current HTML elements: added dialog, search, picture, and slot; made ins, del, button, audio, video, and canvas inline by default (Tag#isInline(), aligned to phrasing content in the spec); and added readable Element.text() boundaries for controls and embedded objects via the new Tag.TextBoundary option. This improves pretty-printing and keeps normalized text from running adjacent words together. #2493
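
The single-pass cleanup that the NodeTraversor note describes might look roughly like the sketch below. It assumes jsoup 1.22.2 is on the classpath; the class name `UnwrapPresentational`, the helper `stripFontTags`, and the choice of `<font>` as the presentational element are illustrative and do not come from the release notes.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.NodeTraversor;

// Hypothetical helper class, for illustration only.
final class UnwrapPresentational {

    /** Parses a body fragment and unwraps every <font> element while walking the DOM once. */
    static String stripFontTags(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        // The lambda implements NodeVisitor.head(); tail() keeps its default no-op.
        NodeTraversor.traverse((node, depth) -> {
            if (node instanceof Element && "font".equals(((Element) node).normalName())) {
                // Edit the current node in place: drop the wrapper, keep its children.
                node.unwrap();
            }
        }, doc.body());
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(stripFontTags(
            "<p>Hello <font color=\"red\"><b>big</b></font> world</p>"));
    }
}
```

Traversal is rooted at `doc.body()`, so per the note above it should stay within that subtree even though the visitor rewrites nodes as it goes.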

Bug Fixes

  • Android (R8/ProGuard): added a rule to ignore the optional re2j dependency when not present. #2459
  • Fixed a NodeTraversor regression in 1.21.2 where removing or replacing the current node during head() could revisit the replacement node and loop indefinitely. The traversal docs now also clarify which inserted nodes are visited in the current pass. #2472
  • Parsing during charset sniffing no longer fails if an advisory available() call throws IOException, as seen on JDK 8 HttpURLConnection. #2474
  • Cleaner no longer makes relative URL attributes in the input document absolute when cleaning or validating a Document. URL normalization now applies only to the cleaned output, and Safelist.isSafeAttribute() is side effect free. #2475
  • Cleaner no longer duplicates enforced attributes when the input Document preserves attribute case. A case-variant source attribute is now replaced by the enforced attribute in the cleaned output. #2476
  • If a per-request SOCKS proxy is configured, jsoup now avoids using the JDK HttpClient, because the JDK would silently ignore that proxy and attempt to connect directly. Those requests now fall back to the legacy HttpURLConnection transport instead, which does support SOCKS. #2468
  • Connection.Response.streamParser() and DataUtil.streamParser(Path, ...) could fail on small inputs without a declared charset, if the initial 5 KB charset sniff fully consumed the input and closed it before the stream parse began. #2483
  • In XML mode, doctypes with an internal subset, such as <!DOCTYPE root [<!ENTITY name "value">]>, now round-trip correctly. The subset is preserved as raw text only; entities are not expanded and external DTDs are not loaded. #2486
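
To see the Cleaner fix (#2475) from the caller's side, a small sketch such as the following may help. It assumes jsoup 1.22.2 on the classpath; the class name `CleanerDemo`, the helper `cleanWithBasicSafelist`, and the example.com base URI are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Cleaner;
import org.jsoup.safety.Safelist;

// Hypothetical helper class, for illustration only.
final class CleanerDemo {

    /** Runs the input through a Cleaner configured with the basic safelist. */
    static Document cleanWithBasicSafelist(Document dirty) {
        Cleaner cleaner = new Cleaner(Safelist.basic());
        return cleaner.clean(dirty);
    }

    public static void main(String[] args) {
        // A relative link plus an attribute the basic safelist does not allow.
        Document dirty = Jsoup.parse(
            "<a href=\"/docs\" onclick=\"x()\">Docs</a>", "https://example.com/");
        Document cleaned = cleanWithBasicSafelist(dirty);

        // Per the release note, the input should keep its relative href,
        // while the cleaned copy carries the absolute URL.
        System.out.println("input href:  " + dirty.selectFirst("a").attr("href"));
        System.out.println("output href: " + cleaned.selectFirst("a").attr("href"));
    }
}
```

Code that cleaned a document and then kept working with the original input is the case most likely to notice the difference after this bump.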

Build Changes

  • Migrated the integration test server from Jetty to Netty, which actively maintains support for our minimum JDK target (8). #2491

My sincere thanks to everyone who contributed to this release! If you have any suggestions for the next release, I would love to hear them; please get in touch via jsoup discussions, or with me directly.

You can also follow me (@jhy@tilde.zone) on Mastodon / Fediverse to receive occasional notes about jsoup releases.

jsoup Java HTML Parser release 1.22.1

jsoup 1.22.1 is out now, adding support for the re2j regular expression engine for regex-based CSS selectors, a configurable maximum parser depth, and numerous bug fixes and improvements.

jsoup is a Java library for working with real-world HTML and XML. It provides a very convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.

Download jsoup now.

Improvements

  • Added support for using the re2j regular expression engine for regex-based CSS selectors (e.g. [attr~=regex], :matches(regex)), which ensures linear-time performance for regex evaluation. This allows safer handling of arbitrary user-supplied query regexes. To enable, add the com.google.re2j dependency to your classpath, e.g.:
  <dependency>
    <groupId>com.google.re2j</groupId>
    <artifactId>re2j</artifactId>
    <version>1.8</version>
  </dependency>

(If you already have that dependency in your classpath, but you want to keep using the Java regex engine, you can disable re2j via System.setProperty("jsoup.useRe2j", "false").) You can confirm that the re2j engine has been enabled correctly by calling Regex.usingRe2j(). #2407
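
As a rough illustration of the re2j note, the sketch below runs a regex-based attribute selector and reports which engine is active. It assumes jsoup 1.22.1 or later on the classpath; the class name `RegexSelectorDemo`, the helper `secureLinks`, and the sample markup are illustrative only. Whether re2j is actually used depends on the optional dependency being present.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.Regex;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Hypothetical helper class, for illustration only.
final class RegexSelectorDemo {

    /** Selects links whose href matches a regex, using the [attr~=regex] selector form. */
    static Elements secureLinks(Document doc) {
        return doc.select("a[href~=^https:]");
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(
            "<a href=\"https://a.example/\">secure</a>"
            + "<a href=\"http://b.example/\">plain</a>");

        // Reports whether the re2j engine was picked up from the classpath.
        System.out.println("re2j in use: " + Regex.usingRe2j());
        System.out.println("matches: " + secureLinks(doc).size());
    }
}
```

The selector result should be the same with either engine for a simple pattern like this one; the difference the note describes is in the evaluation-time guarantee for untrusted patterns.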

... (truncated)

Changelog

Sourced from org.jsoup:jsoup's changelog.

1.22.2 (2026-Apr-20)

Improvements

  • Expanded and clarified NodeTraversor support for in-place DOM rewrites during NodeVisitor.head(). Current-node edits such as remove, replace, and unwrap now recover more predictably, while traversal stays within the original root subtree. This makes single-pass tree cleanup and normalization visitors easier to write, for example when unwrapping presentational elements or replacing text nodes as you walk the DOM. #2472
  • Documentation: clarified that a configured Cleaner may be reused across concurrent threads, and that shared Safelist instances should not be mutated while in use. #2473
  • Updated the default HTML TagSet for current HTML elements: added dialog, search, picture, and slot; made ins, del, button, audio, video, and canvas inline by default (Tag#isInline(), aligned to phrasing content in the spec); and added readable Element.text() boundaries for controls and embedded objects via the new Tag.TextBoundary option. This improves pretty-printing and keeps normalized text from running adjacent words together. #2493

Bug Fixes

  • Android (R8/ProGuard): added a rule to ignore the optional re2j dependency when not present. #2459
  • Fixed a NodeTraversor regression in 1.21.2 where removing or replacing the current node during head() could revisit the replacement node and loop indefinitely. The traversal docs now also clarify which inserted nodes are visited in the current pass. #2472
  • Parsing during charset sniffing no longer fails if an advisory available() call throws IOException, as seen on JDK 8 HttpURLConnection. #2474
  • Cleaner no longer makes relative URL attributes in the input document absolute when cleaning or validating a Document. URL normalization now applies only to the cleaned output, and Safelist.isSafeAttribute() is side effect free. #2475
  • Cleaner no longer duplicates enforced attributes when the input Document preserves attribute case. A case-variant source attribute is now replaced by the enforced attribute in the cleaned output. #2476
  • If a per-request SOCKS proxy is configured, jsoup now avoids using the JDK HttpClient, because the JDK would silently ignore that proxy and attempt to connect directly. Those requests now fall back to the legacy HttpURLConnection transport instead, which does support SOCKS. #2468
  • Connection.Response.streamParser() and DataUtil.streamParser(Path, ...) could fail on small inputs without a declared charset, if the initial 5 KB charset sniff fully consumed the input and closed it before the stream parse began. #2483
  • In XML mode, doctypes with an internal subset, such as <!DOCTYPE root [<!ENTITY name "value">]>, now round-trip correctly. The subset is preserved as raw text only; entities are not expanded and external DTDs are not loaded. #2486

Build Changes

  • Migrated the integration test server from Jetty to Netty, which actively maintains support for our minimum JDK target (8). #2491

1.22.1 (2026-Jan-01)

Improvements

  • Added support for using the re2j regular expression engine for regex-based CSS selectors (e.g. [attr~=regex], :matches(regex)), which ensures linear-time performance for regex evaluation. This allows safer handling of arbitrary user-supplied query regexes. To enable, add the com.google.re2j dependency to your classpath, e.g.:
  <dependency>
    <groupId>com.google.re2j</groupId>
    <artifactId>re2j</artifactId>
    <version>1.8</version>
  </dependency>

(If you already have that dependency in your classpath, but you want to keep using the Java regex engine, you can disable re2j via System.setProperty("jsoup.useRe2j", "false").) You can confirm that the re2j engine has been enabled correctly by calling org.jsoup.helper.Regex.usingRe2j(). #2407

  • Added an instance method Parser#unescape(String, boolean) that unescapes HTML entities using the parser's configuration (e.g. to support error tracking), complementing the existing static utility Parser.unescapeEntities(String, boolean). #2396
  • Added a configurable maximum parser depth (to limit the number of open elements on stack) to both HTML and XML parsers. The HTML parser now defaults to a depth of 512 to match browser behavior, and protect against unbounded stack growth, while the XML parser keeps unlimited depth by default, but can opt into a limit via org.jsoup.parser.Parser#setMaxDepth. #2421
  • Build: added CI coverage for JDK 25 #2403
  • Build: added a CI fuzzer for contextual fragment parsing (in addition to existing full body HTML and XML fuzzers). [oss-fuzz #14041](google/oss-fuzz#14041)
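
Opting the XML parser into a depth limit, as the maximum-depth entry describes, could be sketched as follows. This assumes jsoup 1.22.1 or later and that `Parser#setMaxDepth` accepts an `int`; the class name `MaxDepthDemo`, the helper `parseXmlWithDepthLimit`, and the limit of 64 are illustrative choices. The changelog does not say how input beyond the limit is handled, so the sketch only parses shallow input.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Hypothetical helper class, for illustration only.
final class MaxDepthDemo {

    /** Parses XML with an explicit cap on the number of open elements on the stack. */
    static Document parseXmlWithDepthLimit(String xml, int maxDepth) {
        Parser parser = Parser.xmlParser();
        // XML parsing is unlimited by default; this opts in to a limit.
        parser.setMaxDepth(maxDepth);
        return Jsoup.parse(xml, "", parser);
    }

    public static void main(String[] args) {
        Document doc = parseXmlWithDepthLimit("<a><b>text</b></a>", 64);
        System.out.println(doc.select("b").text());
    }
}
```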

Changes

  • Set a removal schedule of jsoup 1.24.1 for previously deprecated APIs.

Bug Fixes

  • Previously cached child Elements of an Element were not correctly invalidated in Node#replaceWith(Node), which could lead to incorrect results when subsequently calling Element#children(). #2391
  • Attribute selector values are now compared literally without trimming. Previously, jsoup trimmed whitespace from selector values and from element attribute values, which could cause mismatches with browser behavior (e.g. [attr=" foo "]). Now matches align with the CSS specification and browser engines. #2380
  • When using the JDK HttpClient, any system default proxy (ProxySelector.getDefault()) was ignored. Now, the system proxy is used if a per-request proxy is not set. #2388, #2390
  • A ValidationException could be thrown in the adoption agency algorithm with particularly broken input. Now logged as a parse error. #2393
  • Null characters in the HTML body were not consistently removed; and in foreign content were not correctly replaced. #2395
  • An IndexOutOfBoundsException could be thrown when parsing a body fragment with crafted input. Now logged as a parse error. #2397, #2406
  • When using StructuralEvaluators (e.g., a parent child selector) across many retained threads, their memoized results could also be retained, increasing memory use. These results are now cleared immediately after use, reducing overall memory consumption. #2411
  • Cloning a Parser now preserves any custom TagSet applied to the parser. #2422, #2423
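
The attribute-selector change (#2380) is the entry here most likely to alter query results, so a quick check like the sketch below may be worth running against existing selectors. It assumes jsoup 1.22.1 or later; the class name `AttrSelectorDemo`, the helper `exactMatch`, and the `data-x` attribute are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Hypothetical helper class, for illustration only.
final class AttrSelectorDemo {

    /** Selects div elements whose data-x attribute equals the given value, whitespace included. */
    static Elements exactMatch(Document doc, String value) {
        return doc.select("div[data-x=\"" + value + "\"]");
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(
            "<div data-x=\" foo \">padded</div><div data-x=\"foo\">plain</div>");

        // Per the changelog, the padded selector value should now match only
        // the element whose attribute carries the same surrounding spaces.
        for (org.jsoup.nodes.Element el : exactMatch(doc, " foo ")) {
            System.out.println(el.text());
        }
    }
}
```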

... (truncated)

Commits
  • ac28afe [maven-release-plugin] prepare release jsoup-1.22.2
  • 52f2cd3 Improve entity example in changelog
  • cf6ffe0 Add Tag#TextBoundary option; bring TagSet to spec (#2493)
  • 2be739c Bump github/codeql-action from 4 to 4.35.1 (#2492)
  • 45de7cb Migrate integration test server from Jetty to Netty (#2491)
  • 1df14ed Preserve XML doctype internal subset
  • 06fa52d Adding Contribution Guide
  • d4a8941 Simplify the test; doesn't need the buffer
  • 823709f Don't reuse a fully read sniffed doc for StreamParser
  • e1b0df5 NodeFilter javadoc tweak
  • Additional commits viewable in compare view

Note
Automatic rebases have been disabled on this pull request as it has been open for over 30 days.

dependabot bot added the dependencies (Pull requests that update a dependency file) and java (Pull requests that update java code) labels on Jan 1, 2026

coderabbitai bot commented on Jan 1, 2026

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

Brutus5000 (Member) commented:

@dependabot rebase

dependabot bot changed the title from "Bump org.jsoup:jsoup from 1.21.2 to 1.22.1" to "Bump org.jsoup:jsoup from 1.21.2 to 1.22.2" on Jun 6, 2026
dependabot bot force-pushed the dependabot/gradle/org.jsoup-jsoup-1.22.1 branch from cf4aced to ae6b5d6 on June 6, 2026 20:59

Brutus5000 (Member) commented:

@dependabot rebase

Bumps [org.jsoup:jsoup](https://github.com/jhy/jsoup) from 1.21.2 to 1.22.2.
- [Release notes](https://github.com/jhy/jsoup/releases)
- [Changelog](https://github.com/jhy/jsoup/blob/master/CHANGES.md)
- [Commits](jhy/jsoup@jsoup-1.21.2...jsoup-1.22.2)

---
updated-dependencies:
- dependency-name: org.jsoup:jsoup
  dependency-version: 1.22.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot bot force-pushed the dependabot/gradle/org.jsoup-jsoup-1.22.1 branch from ae6b5d6 to 69242dc on June 6, 2026 21:06
Brutus5000 merged commit 8f92f93 into develop on Jun 6, 2026 (3 checks passed)
Brutus5000 deleted the dependabot/gradle/org.jsoup-jsoup-1.22.1 branch on June 6, 2026 21:07